Overview

Dataset statistics

Number of variables: 13
Number of observations: 2968
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 290.0 KiB
Average record size in memory: 100.0 B
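
As a quick sanity check, the average record size should roughly equal the total in-memory size divided by the number of observations. The sketch below assumes binary units (1 KiB = 1024 bytes), which the report does not state explicitly; the variable names are illustrative.

```python
# Values copied from the dataset statistics above.
total_size_kib = 290.0
n_observations = 2968

# Assumption: the report uses binary units (1 KiB = 1024 bytes).
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"Average record size: {avg_record_bytes:.2f} B")
```

Because the 290.0 KiB total is itself rounded, the result agrees with the reported 100.0 B only to within rounding.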

Variable types

Numeric: 13

Alerts

gross_revenue is highly correlated with qty_invoices and 3 other fields [High correlation]
recency_days is highly correlated with qty_invoices [High correlation]
qty_invoices is highly correlated with gross_revenue and 4 other fields [High correlation]
qty_itemns is highly correlated with gross_revenue and 3 other fields [High correlation]
qty_products is highly correlated with gross_revenue and 2 other fields [High correlation]
avg_recency_days is highly correlated with frequency [High correlation]
frequency is highly correlated with avg_recency_days [High correlation]
avg_basket_size is highly correlated with gross_revenue and 1 other field [High correlation]
avg_unique_basket_size is highly correlated with qty_invoices [High correlation]
gross_revenue is highly correlated with qty_invoices and 1 other field [High correlation]
qty_invoices is highly correlated with gross_revenue and 2 other fields [High correlation]
qty_itemns is highly correlated with gross_revenue and 1 other field [High correlation]
qty_products is highly correlated with qty_invoices [High correlation]
avg_ticket is highly correlated with qty_returns and 1 other field [High correlation]
qty_returns is highly correlated with avg_ticket [High correlation]
avg_basket_size is highly correlated with avg_ticket [High correlation]
gross_revenue is highly correlated with qty_itemns and 1 other field [High correlation]
qty_invoices is highly correlated with qty_itemns and 1 other field [High correlation]
qty_itemns is highly correlated with gross_revenue and 3 other fields [High correlation]
qty_products is highly correlated with gross_revenue and 1 other field [High correlation]
avg_recency_days is highly correlated with frequency and 1 other field [High correlation]
frequency is highly correlated with avg_recency_days [High correlation]
qty_returns is highly correlated with avg_recency_days [High correlation]
avg_basket_size is highly correlated with qty_itemns [High correlation]
avg_unique_basket_size is highly correlated with qty_invoices [High correlation]
gross_revenue is highly correlated with qty_invoices and 4 other fields [High correlation]
qty_invoices is highly correlated with gross_revenue and 3 other fields [High correlation]
qty_itemns is highly correlated with gross_revenue and 4 other fields [High correlation]
qty_products is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_ticket is highly correlated with qty_returns and 1 other field [High correlation]
qty_returns is highly correlated with gross_revenue and 5 other fields [High correlation]
avg_basket_size is highly correlated with gross_revenue and 4 other fields [High correlation]
avg_unique_basket_size is highly correlated with avg_basket_size [High correlation]
avg_ticket is highly skewed (γ1 = 25.1569664) [Skewed]
frequency is highly skewed (γ1 = 24.87687084) [Skewed]
qty_returns is highly skewed (γ1 = 21.9754032) [Skewed]
df_index has unique values [Unique]
customer_id has unique values [Unique]
recency_days has 33 (1.1%) zeros [Zeros]
qty_returns has 1481 (49.9%) zeros [Zeros]
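
The same variable appears in several high-correlation alerts. pandas-profiling can raise this alert for more than one correlation coefficient, which would explain the repeats, but the exported text does not say which coefficient produced which line. As a hedged illustration of why different coefficients can disagree, here is a minimal standard-library sketch of Pearson and Spearman correlation. The function names and the toy data are hypothetical and not taken from the report.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Made-up toy data (not from the report): revenue grows monotonically
# but non-linearly with the number of invoices.
qty_invoices = [1, 2, 3, 4, 5, 6]
gross_revenue = [10.0, 40.0, 90.0, 160.0, 250.0, 360.0]

print(round(pearson(qty_invoices, gross_revenue), 3))
print(round(spearman(qty_invoices, gross_revenue), 3))
```

On monotonic but non-linear data like this, the rank-based coefficient reaches 1 while Pearson stays slightly below it, so a pair of columns can cross an alert threshold under one coefficient and not another.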

Reproduction

Analysis started: 2022-05-30 21:48:50.031498
Analysis finished: 2022-05-30 21:50:10.034738
Duration: 1 minute and 20 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json
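
The reported duration follows directly from the two timestamps. A minimal sketch of the arithmetic, using only the values printed above:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2022-05-30 21:48:50.031498")
finished = datetime.fromisoformat("2022-05-30 21:50:10.034738")

duration = finished - started
# Drop the sub-second remainder, as the report does.
minutes, seconds = divmod(int(duration.total_seconds()), 60)
print(f"{minutes} minute(s) and {seconds} second(s)")
```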

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 2968
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2316.666442
Minimum: 0
Maximum: 5714
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 185.35
Q1: 928.5
Median: 2119.5
Q3: 3536.25
95-th percentile: 5034.3
Maximum: 5714
Range: 5714
Interquartile range (IQR): 2607.75

Descriptive statistics

Standard deviation: 1554.722712
Coefficient of variation (CV): 0.6711033938
Kurtosis: -1.010637904
Mean: 2316.666442
Median Absolute Deviation (MAD): 1270.5
Skewness: 0.3426249769
Sum: 6875866
Variance: 2417162.71
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0 | 1 | < 0.1%
3010 | 1 | < 0.1%
2995 | 1 | < 0.1%
2996 | 1 | < 0.1%
2999 | 1 | < 0.1%
3000 | 1 | < 0.1%
3001 | 1 | < 0.1%
3002 | 1 | < 0.1%
3005 | 1 | < 0.1%
3007 | 1 | < 0.1%
Other values (2958) | 2958 | 99.7%

Minimum 10 values

Value | Count | Frequency (%)
0 | 1 | < 0.1%
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
5714 | 1 | < 0.1%
5695 | 1 | < 0.1%
5685 | 1 | < 0.1%
5679 | 1 | < 0.1%
5658 | 1 | < 0.1%
5654 | 1 | < 0.1%
5648 | 1 | < 0.1%
5637 | 1 | < 0.1%
5636 | 1 | < 0.1%
5626 | 1 | < 0.1%
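
Several of the df_index statistics are simple functions of the others, so they can be cross-checked against one another. The sketch below recomputes Range, IQR, CV and Variance from the printed values. It assumes the usual definitions (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared), which the report does not spell out.

```python
# Values copied from the df_index statistics above.
minimum, maximum = 0, 5714
q1, q3 = 928.5, 3536.25
mean = 2316.666442
std = 1554.722712

value_range = maximum - minimum   # reported Range: 5714
iqr = q3 - q1                     # reported IQR: 2607.75
cv = std / mean                   # reported CV: 0.6711033938
variance = std ** 2               # reported Variance: 2417162.71

print(value_range, iqr, round(cv, 6), round(variance, 2))
```

Small differences in the last digits are expected because the inputs are already rounded to about ten significant figures.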

customer_id
Real number (ℝ≥0)

UNIQUE

Distinct: 2968
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15270.37702
Minimum: 12347
Maximum: 18287
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.7 KiB

Quantile statistics

Minimum: 12347
5-th percentile: 12619.35
Q1: 13798.75
Median: 15220.5
Q3: 16768.5
95-th percentile: 17964.65
Maximum: 18287
Range: 5940
Interquartile range (IQR): 2969.75

Descriptive statistics

Standard deviation: 1719.144523
Coefficient of variation (CV): 0.1125803587
Kurtosis: -1.206178196
Mean: 15270.37702
Median Absolute Deviation (MAD): 1489
Skewness: 0.03219371129
Sum: 45322479
Variance: 2955457.892
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
17850 | 1 | < 0.1%
12670 | 1 | < 0.1%
17734 | 1 | < 0.1%
14905 | 1 | < 0.1%
16103 | 1 | < 0.1%
14626 | 1 | < 0.1%
14868 | 1 | < 0.1%
18246 | 1 | < 0.1%
17115 | 1 | < 0.1%
16611 | 1 | < 0.1%
Other values (2958) | 2958 | 99.7%

Minimum 10 values

Value | Count | Frequency (%)
12347 | 1 | < 0.1%
12348 | 1 | < 0.1%
12352 | 1 | < 0.1%
12356 | 1 | < 0.1%
12358 | 1 | < 0.1%
12359 | 1 | < 0.1%
12360 | 1 | < 0.1%
12362 | 1 | < 0.1%
12364 | 1 | < 0.1%
12370 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
18287 | 1 | < 0.1%
18283 | 1 | < 0.1%
18282 | 1 | < 0.1%
18277 | 1 | < 0.1%
18276 | 1 | < 0.1%
18274 | 1 | < 0.1%
18273 | 1 | < 0.1%
18272 | 1 | < 0.1%
18270 | 1 | < 0.1%
18269 | 1 | < 0.1%
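
The quantile statistics for customer_id have fractional values (for example 12619.35) even though every ID is an integer. That is what linear interpolation between neighbouring sorted values would produce: with 2968 observations the 5th percentile sits at position (2968 - 1) x 0.05 = 148.35, and the .35 carries over when the two neighbours differ by 1. The report does not name its interpolation rule, so the following is a sketch under that assumption, with a hypothetical helper and toy data.

```python
def percentile(values, p):
    """p-th percentile (0 <= p <= 100) using linear interpolation."""
    data = sorted(values)
    if not data:
        raise ValueError("percentile() requires at least one value")
    # Fractional position of the requested percentile in the sorted data.
    pos = (len(data) - 1) * p / 100
    lower = int(pos)
    upper = min(lower + 1, len(data) - 1)
    weight = pos - lower
    return data[lower] + (data[upper] - data[lower]) * weight

# Hypothetical toy IDs, not the real customer_id column.
ids = [12347, 12348, 12352, 12356, 12358]
q1, median, q3 = (percentile(ids, p) for p in (25, 50, 75))
print(q1, median, q3, q3 - q1)
```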

gross_revenue
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2953
Distinct (%): 99.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2693.485061
Minimum: 6.2
Maximum: 279138.02
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 6.2
5-th percentile: 229.7325
Q1: 570.845
Median: 1085.51
Q3: 2306.905
95-th percentile: 7169.562
Maximum: 279138.02
Range: 279131.82
Interquartile range (IQR): 1736.06

Descriptive statistics

Standard deviation: 10135.46528
Coefficient of variation (CV): 3.762955818
Kurtosis: 397.3013221
Mean: 2693.485061
Median Absolute Deviation (MAD): 670.84
Skewness: 17.63537227
Sum: 7994263.66
Variance: 102727656.5
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
1078.96 | 2 | 0.1%
2053.02 | 2 | 0.1%
331 | 2 | 0.1%
1353.74 | 2 | 0.1%
889.93 | 2 | 0.1%
745.06 | 2 | 0.1%
379.65 | 2 | 0.1%
2092.32 | 2 | 0.1%
731.9 | 2 | 0.1%
734.94 | 2 | 0.1%
Other values (2943) | 2948 | 99.3%

Minimum 10 values

Value | Count | Frequency (%)
6.2 | 1 | < 0.1%
13.3 | 1 | < 0.1%
15 | 1 | < 0.1%
36.56 | 1 | < 0.1%
45 | 1 | < 0.1%
52 | 1 | < 0.1%
52.2 | 1 | < 0.1%
52.2 | 1 | < 0.1%
62.43 | 1 | < 0.1%
68.84 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
279138.02 | 1 | < 0.1%
259657.3 | 1 | < 0.1%
194550.79 | 1 | < 0.1%
140450.72 | 1 | < 0.1%
124564.53 | 1 | < 0.1%
117379.63 | 1 | < 0.1%
91062.38 | 1 | < 0.1%
72882.09 | 1 | < 0.1%
66653.56 | 1 | < 0.1%
65039.62 | 1 | < 0.1%
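
For gross_revenue the standard deviation (10135.47) is far larger than the Median Absolute Deviation (670.84), which is typical when a few very large values dominate. A minimal sketch of the unscaled MAD follows. The report does not say whether a scaling constant is applied, so the helper and the toy figures are assumptions for illustration only.

```python
from statistics import median, stdev

def median_absolute_deviation(values):
    """Median of absolute deviations from the median (no scaling factor)."""
    center = median(values)
    return median(abs(v - center) for v in values)

# Made-up revenues with one extreme customer, not the real column.
revenues = [570.0, 1085.0, 2300.0, 7100.0, 279000.0]

print(median_absolute_deviation(revenues))
print(round(stdev(revenues), 2))
```

The single extreme value barely moves the MAD but inflates the standard deviation, which is why the two spread measures diverge so strongly in the statistics above.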

recency_days
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 272
Distinct (%): 9.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 64.30929919
Minimum: 0
Maximum: 373
Zeros: 33
Zeros (%): 1.1%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 11
Median: 31
Q3: 81
95-th percentile: 242
Maximum: 373
Range: 373
Interquartile range (IQR): 70

Descriptive statistics

Standard deviation: 77.76092244
Coefficient of variation (CV): 1.209170733
Kurtosis: 2.776517247
Mean: 64.30929919
Median Absolute Deviation (MAD): 26
Skewness: 1.798052889
Sum: 190870
Variance: 6046.761059
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
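Common values

Value | Count | Frequency (%)
1 | 99 | 3.3%
4 | 87 | 2.9%
2 | 85 | 2.9%
3 | 85 | 2.9%
8 | 76 | 2.6%
10 | 67 | 2.3%
9 | 66 | 2.2%
7 | 66 | 2.2%
17 | 64 | 2.2%
16 | 55 | 1.9%
Other values (262) | 2218 | 74.7%

Minimum 10 values

Value | Count | Frequency (%)
0 | 33 | 1.1%
1 | 99 | 3.3%
2 | 85 | 2.9%
3 | 85 | 2.9%
4 | 87 | 2.9%
5 | 43 | 1.4%
7 | 66 | 2.2%
8 | 76 | 2.6%
9 | 66 | 2.2%
10 | 67 | 2.3%

Maximum 10 values

Value | Count | Frequency (%)
373 | 2 | 0.1%
372 | 4 | 0.1%
371 | 1 | < 0.1%
368 | 1 | < 0.1%
366 | 4 | 0.1%
365 | 2 | 0.1%
364 | 1 | < 0.1%
360 | 1 | < 0.1%
359 | 1 | < 0.1%
358 | 4 | 0.1%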
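
In the recency_days value counts, the ten most frequent values cover 750 of the 2968 observations. The remaining 2218 (74.7%) are grouped under "Other values (262)", where 262 is the number of remaining distinct values (272 - 10). A hedged sketch of how such a summary could be built with the standard library is shown here. The function name and the sample data are illustrative, not the report's implementation.

```python
from collections import Counter

def common_values(values, top=10):
    """Return (value, count, share) rows plus an 'Other values' summary."""
    total = len(values)
    if total == 0:
        raise ValueError("common_values() requires at least one value")
    counts = Counter(values)
    top_items = counts.most_common(top)
    rows = [(value, count, count / total) for value, count in top_items]
    other_count = total - sum(count for _, count in top_items)
    other_distinct = len(counts) - len(top_items)
    return rows, (other_distinct, other_count, other_count / total)

# Made-up sample, not the real recency_days column.
sample = [1, 1, 1, 4, 4, 2, 3, 8, 8, 8, 8, 0]
rows, other = common_values(sample, top=3)
for value, count, share in rows:
    print(f"{value} | {count} | {share:.1%}")
print(f"Other values ({other[0]}) | {other[1]} | {other[2]:.1%}")
```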

qty_invoices
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 56
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.724393531
Minimum: 1
Maximum: 206
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 4
Q3: 6
95-th percentile: 17
Maximum: 206
Range: 205
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 8.857759893
Coefficient of variation (CV): 1.547370886
Kurtosis: 190.7862392
Mean: 5.724393531
Median Absolute Deviation (MAD): 2
Skewness: 10.76555481
Sum: 16990
Variance: 78.45991032
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
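Common values

Value | Count | Frequency (%)
2 | 784 | 26.4%
3 | 499 | 16.8%
4 | 393 | 13.2%
5 | 237 | 8.0%
1 | 190 | 6.4%
6 | 173 | 5.8%
7 | 138 | 4.6%
8 | 98 | 3.3%
9 | 69 | 2.3%
10 | 55 | 1.9%
Other values (46) | 332 | 11.2%

Minimum 10 values

Value | Count | Frequency (%)
1 | 190 | 6.4%
2 | 784 | 26.4%
3 | 499 | 16.8%
4 | 393 | 13.2%
5 | 237 | 8.0%
6 | 173 | 5.8%
7 | 138 | 4.6%
8 | 98 | 3.3%
9 | 69 | 2.3%
10 | 55 | 1.9%

Maximum 10 values

Value | Count | Frequency (%)
206 | 1 | < 0.1%
199 | 1 | < 0.1%
124 | 1 | < 0.1%
97 | 1 | < 0.1%
91 | 2 | 0.1%
86 | 1 | < 0.1%
72 | 1 | < 0.1%
62 | 2 | 0.1%
60 | 1 | < 0.1%
57 | 1 | < 0.1%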
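
The histograms in this report use fixed-size bins (bins=50). With a column as right-skewed as qty_invoices (median 4, maximum 206), equal-width bins put almost every observation in the first few bins. The sketch below shows one plain way to compute such bins. It is an assumption about the general technique, not the plotting code the report actually ran, and the data are made up.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    if hi == lo:
        # Degenerate case: all values identical, put them in the first bin.
        counts[0] = len(values)
        return counts
    width = (hi - lo) / bins
    for v in values:
        # The maximum falls on the last edge; clamp it into the final bin.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts

# Made-up invoice counts with one extreme customer.
invoices = [1, 2, 2, 2, 3, 3, 4, 5, 6, 206]
counts = fixed_width_histogram(invoices, bins=50)
print(counts[0], counts[1], counts[-1], sum(counts))
```

Here eight of the ten toy values share the first bin and 48 bins stay empty, which mirrors how a long right tail flattens a fixed-bin histogram.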

qty_itemns
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 1670
Distinct (%): 56.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1582.104447
Minimum: 1
Maximum: 196844
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 102.35
Q1: 296
Median: 640
Q3: 1399.5
95-th percentile: 4403.25
Maximum: 196844
Range: 196843
Interquartile range (IQR): 1103.5

Descriptive statistics

Standard deviation: 5705.291445
Coefficient of variation (CV): 3.60614083
Kurtosis: 516.7418024
Mean: 1582.104447
Median Absolute Deviation (MAD): 421
Skewness: 18.73765362
Sum: 4695686
Variance: 32550350.48
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
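Common values

Value | Count | Frequency (%)
310 | 11 | 0.4%
150 | 9 | 0.3%
88 | 9 | 0.3%
246 | 8 | 0.3%
272 | 8 | 0.3%
84 | 8 | 0.3%
260 | 8 | 0.3%
288 | 8 | 0.3%
1200 | 7 | 0.2%
516 | 7 | 0.2%
Other values (1660) | 2885 | 97.2%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 2 | 0.1%
12 | 2 | 0.1%
16 | 1 | < 0.1%
17 | 1 | < 0.1%
18 | 1 | < 0.1%
19 | 1 | < 0.1%
20 | 1 | < 0.1%
23 | 1 | < 0.1%
25 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
196844 | 1 | < 0.1%
80263 | 1 | < 0.1%
77373 | 1 | < 0.1%
69993 | 1 | < 0.1%
64549 | 1 | < 0.1%
64124 | 1 | < 0.1%
63312 | 1 | < 0.1%
58343 | 1 | < 0.1%
57885 | 1 | < 0.1%
50255 | 1 | < 0.1%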

qty_products
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 468
Distinct (%): 15.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 122.7644879
Minimum: 1
Maximum: 7838
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 9
Q1: 29
Median: 67
Q3: 135
95-th percentile: 382
Maximum: 7838
Range: 7837
Interquartile range (IQR): 106

Descriptive statistics

Standard deviation: 269.9329358
Coefficient of variation (CV): 2.198786803
Kurtosis: 354.7788412
Mean: 122.7644879
Median Absolute Deviation (MAD): 44
Skewness: 15.7061352
Sum: 364365
Variance: 72863.78981
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
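Common values

Value | Count | Frequency (%)
28 | 43 | 1.4%
20 | 37 | 1.2%
35 | 35 | 1.2%
29 | 35 | 1.2%
19 | 34 | 1.1%
15 | 33 | 1.1%
11 | 32 | 1.1%
26 | 31 | 1.0%
27 | 30 | 1.0%
25 | 30 | 1.0%
Other values (458) | 2628 | 88.5%

Minimum 10 values

Value | Count | Frequency (%)
1 | 6 | 0.2%
2 | 14 | 0.5%
3 | 15 | 0.5%
4 | 17 | 0.6%
5 | 26 | 0.9%
6 | 29 | 1.0%
7 | 18 | 0.6%
8 | 19 | 0.6%
9 | 26 | 0.9%
10 | 28 | 0.9%

Maximum 10 values

Value | Count | Frequency (%)
7838 | 1 | < 0.1%
5673 | 1 | < 0.1%
5095 | 1 | < 0.1%
4580 | 1 | < 0.1%
2698 | 1 | < 0.1%
2379 | 1 | < 0.1%
2060 | 1 | < 0.1%
1818 | 1 | < 0.1%
1673 | 1 | < 0.1%
1637 | 1 | < 0.1%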

avg_ticket
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
SKEWED

Distinct: 2965
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.99425671
Minimum: 2.150588235
Maximum: 4453.43
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 2.150588235
5-th percentile: 4.915887985
Q1: 13.11811111
Median: 17.95344712
Q3: 24.98179365
95-th percentile: 90.052125
Maximum: 4453.43
Range: 4451.279412
Interquartile range (IQR): 11.86368254

Descriptive statistics

Standard deviation: 119.5320656
Coefficient of variation (CV): 3.622814318
Kurtosis: 812.9647397
Mean: 32.99425671
Median Absolute Deviation (MAD): 5.979018644
Skewness: 25.1569664
Sum: 97926.95393
Variance: 14287.91471
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
15 | 2 | 0.1%
4.162 | 2 | 0.1%
14.47833333 | 2 | 0.1%
18.15222222 | 1 | < 0.1%
13.92736842 | 1 | < 0.1%
36.24411765 | 1 | < 0.1%
29.78416667 | 1 | < 0.1%
22.8792623 | 1 | < 0.1%
20.51104167 | 1 | < 0.1%
149.025 | 1 | < 0.1%
Other values (2955) | 2955 | 99.6%

Minimum 10 values

Value | Count | Frequency (%)
2.150588235 | 1 | < 0.1%
2.4325 | 1 | < 0.1%
2.462371134 | 1 | < 0.1%
2.511241379 | 1 | < 0.1%
2.515333333 | 1 | < 0.1%
2.65 | 1 | < 0.1%
2.656931818 | 1 | < 0.1%
2.707598253 | 1 | < 0.1%
2.760621572 | 1 | < 0.1%
2.770464191 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
4453.43 | 1 | < 0.1%
3202.92 | 1 | < 0.1%
1687.2 | 1 | < 0.1%
952.9875 | 1 | < 0.1%
872.13 | 1 | < 0.1%
841.0214493 | 1 | < 0.1%
651.1683333 | 1 | < 0.1%
640 | 1 | < 0.1%
624.4 | 1 | < 0.1%
615.75 | 1 | < 0.1%
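
avg_ticket carries the SKEWED flag with a skewness of about 25.16, driven by a handful of very large tickets (the maximum is 4453.43 against a median near 18). The report does not state which skewness estimator it uses. The sketch below implements the bias-adjusted Fisher-Pearson coefficient, which is what pandas' `Series.skew` computes, and applies it to made-up data.

```python
from math import sqrt

def sample_skewness(values):
    """Bias-adjusted Fisher-Pearson skewness coefficient (G1)."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least three values")
    mean = sum(values) / n
    # Biased central moments of order 2 and 3.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        return 0.0
    g1 = m3 / m2 ** 1.5
    return sqrt(n * (n - 1)) / (n - 2) * g1

# Toy data only: a symmetric sample and one with a single huge ticket.
symmetric = [10.0, 15.0, 20.0, 25.0, 30.0]
right_tailed = [10.0, 15.0, 20.0, 25.0, 4453.43]

print(sample_skewness(symmetric))
print(sample_skewness(right_tailed))
```

A symmetric sample scores zero, while one extreme value on the right pushes the coefficient clearly positive.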

avg_recency_days
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION

Distinct: 1258
Distinct (%): 42.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 67.30213285
Minimum: 1
Maximum: 366
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 8
Q1: 25.91730769
Median: 48.26785714
Q3: 85.33333333
95-th percentile: 200.65
Maximum: 366
Range: 365
Interquartile range (IQR): 59.41602564

Descriptive statistics

Standard deviation: 63.50535844
Coefficient of variation (CV): 0.9435861206
Kurtosis: 4.908048776
Mean: 67.30213285
Median Absolute Deviation (MAD): 26.26785714
Skewness: 2.066084007
Sum: 199752.7303
Variance: 4032.93055
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
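Common values

Value | Count | Frequency (%)
14 | 25 | 0.8%
4 | 22 | 0.7%
70 | 21 | 0.7%
7 | 20 | 0.7%
35 | 19 | 0.6%
49 | 18 | 0.6%
11 | 17 | 0.6%
46 | 17 | 0.6%
21 | 17 | 0.6%
28 | 16 | 0.5%
Other values (1248) | 2776 | 93.5%

Minimum 10 values

Value | Count | Frequency (%)
1 | 16 | 0.5%
1.5 | 1 | < 0.1%
2 | 13 | 0.4%
2.5 | 1 | < 0.1%
2.601398601 | 1 | < 0.1%
3 | 15 | 0.5%
3.321428571 | 1 | < 0.1%
3.330357143 | 1 | < 0.1%
3.5 | 2 | 0.1%
4 | 22 | 0.7%

Maximum 10 values

Value | Count | Frequency (%)
366 | 1 | < 0.1%
365 | 1 | < 0.1%
363 | 1 | < 0.1%
362 | 1 | < 0.1%
357 | 2 | 0.1%
356 | 1 | < 0.1%
355 | 2 | 0.1%
352 | 1 | < 0.1%
351 | 2 | 0.1%
350 | 3 | 0.1%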

frequency
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
SKEWED

Distinct: 1225
Distinct (%): 41.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1138323742
Minimum: 0.005449591281
Maximum: 17
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 0.005449591281
5-th percentile: 0.008893504781
Q1: 0.01633986928
Median: 0.02589835169
Q3: 0.04947858264
95-th percentile: 1
Maximum: 17
Range: 16.99455041
Interquartile range (IQR): 0.03313871336

Descriptive statistics

Standard deviation: 0.4082205551
Coefficient of variation (CV): 3.586155151
Kurtosis: 989.0663249
Mean: 0.1138323742
Median Absolute Deviation (MAD): 0.0121968864
Skewness: 24.87687084
Sum: 337.8544866
Variance: 0.1666440216
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
1 | 198 | 6.7%
0.0625 | 18 | 0.6%
0.02777777778 | 17 | 0.6%
0.02380952381 | 16 | 0.5%
0.09090909091 | 15 | 0.5%
0.08333333333 | 15 | 0.5%
0.03448275862 | 14 | 0.5%
0.02941176471 | 14 | 0.5%
0.03571428571 | 13 | 0.4%
0.07692307692 | 13 | 0.4%
Other values (1215) | 2635 | 88.8%

Minimum 10 values

Value | Count | Frequency (%)
0.005449591281 | 1 | < 0.1%
0.005464480874 | 1 | < 0.1%
0.005479452055 | 1 | < 0.1%
0.005494505495 | 1 | < 0.1%
0.005586592179 | 2 | 0.1%
0.005602240896 | 1 | < 0.1%
0.005617977528 | 2 | 0.1%
0.00566572238 | 1 | < 0.1%
0.005681818182 | 2 | 0.1%
0.005698005698 | 3 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
17 | 1 | < 0.1%
3 | 1 | < 0.1%
2 | 6 | 0.2%
1.142857143 | 1 | < 0.1%
1 | 198 | 6.7%
0.75 | 1 | < 0.1%
0.6666666667 | 3 | 0.1%
0.550802139 | 1 | < 0.1%
0.5335120643 | 1 | < 0.1%
0.5 | 3 | 0.1%
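
The kurtosis of frequency (about 989) is extreme, while df_index and customer_id show negative values near -1.2. That range suggests the report is quoting excess kurtosis, although the text does not say so. Under that assumption, the sketch below implements the bias-adjusted excess kurtosis that pandas' `Series.kurt` computes. The helper name and data are illustrative.

```python
def sample_excess_kurtosis(values):
    """Bias-adjusted excess kurtosis (G2); roughly 0 for normal data."""
    n = len(values)
    if n < 4:
        raise ValueError("kurtosis needs at least four values")
    mean = sum(values) / n
    # Biased central moments of order 2 and 4.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        return 0.0
    g2 = m4 / m2 ** 2 - 3
    return (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)

# Toy data only: evenly spread values versus one extreme value.
evenly_spread = [1.0, 2.0, 3.0, 4.0, 5.0]
heavy_tailed = [0.01] * 9 + [17.0]

print(round(sample_excess_kurtosis(evenly_spread), 6))
print(sample_excess_kurtosis(heavy_tailed) > 0)
```

Evenly spread values give a negative result, in line with the near-uniform ID columns, while a single extreme value produces a large positive one.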

qty_returns
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED
ZEROS

Distinct: 213
Distinct (%): 7.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 34.88847709
Minimum: 0
Maximum: 9014
Zeros: 1481
Zeros (%): 49.9%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 9
95-th percentile: 100
Maximum: 9014
Range: 9014
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 282.864784
Coefficient of variation (CV): 8.107685048
Kurtosis: 596.2019916
Mean: 34.88847709
Median Absolute Deviation (MAD): 1
Skewness: 21.9754032
Sum: 103549
Variance: 80012.48604
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
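Common values

Value | Count | Frequency (%)
0 | 1481 | 49.9%
1 | 164 | 5.5%
2 | 148 | 5.0%
3 | 105 | 3.5%
4 | 89 | 3.0%
6 | 78 | 2.6%
5 | 61 | 2.1%
12 | 51 | 1.7%
7 | 43 | 1.4%
8 | 43 | 1.4%
Other values (203) | 705 | 23.8%

Minimum 10 values

Value | Count | Frequency (%)
0 | 1481 | 49.9%
1 | 164 | 5.5%
2 | 148 | 5.0%
3 | 105 | 3.5%
4 | 89 | 3.0%
5 | 61 | 2.1%
6 | 78 | 2.6%
7 | 43 | 1.4%
8 | 43 | 1.4%
9 | 41 | 1.4%

Maximum 10 values

Value | Count | Frequency (%)
9014 | 1 | < 0.1%
8004 | 1 | < 0.1%
4427 | 1 | < 0.1%
3768 | 1 | < 0.1%
3332 | 1 | < 0.1%
2878 | 1 | < 0.1%
2022 | 1 | < 0.1%
2012 | 1 | < 0.1%
1776 | 1 | < 0.1%
1594 | 1 | < 0.1%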
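
The ZEROS flag and the median of qty_returns fit together: 1481 of the 2968 customers have no returns, just under half, so the median falls on the next value, 1. The sketch below reproduces both figures from the reported counts. Since only the counts for 0 and 1 matter for the median, every larger value is replaced by a placeholder, which is an assumption made purely to keep the example short.

```python
from statistics import median

# Counts copied from the qty_returns statistics above.
n_observations = 2968
zeros = 1481
ones = 164

zeros_pct = 100 * zeros / n_observations
print(f"Zeros: {zeros_pct:.1f}%")

# Placeholder column: all values above 1 are stood in for by 2, which
# does not change the median because they sort after the zeros and ones.
column = [0] * zeros + [1] * ones + [2] * (n_observations - zeros - ones)
print("Median:", median(column))
```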

avg_basket_size
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 1978
Distinct (%): 66.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 236.252886
Minimum: 1
Maximum: 6009.333333
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 44
Q1: 103.2375
Median: 172.2916667
Q3: 281.5480769
95-th percentile: 599.58
Maximum: 6009.333333
Range: 6008.333333
Interquartile range (IQR): 178.3105769

Descriptive statistics

Standard deviation: 283.8931966
Coefficient of variation (CV): 1.201649645
Kurtosis: 102.7816879
Mean: 236.252886
Median Absolute Deviation (MAD): 83.04166667
Skewness: 7.701877717
Sum: 701198.5657
Variance: 80595.34706
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
100 | 11 | 0.4%
114 | 10 | 0.3%
82 | 9 | 0.3%
86 | 9 | 0.3%
73 | 9 | 0.3%
136 | 8 | 0.3%
75 | 8 | 0.3%
60 | 8 | 0.3%
88 | 8 | 0.3%
130 | 7 | 0.2%
Other values (1968) | 2881 | 97.1%

Minimum 10 values

Value | Count | Frequency (%)
1 | 2 | 0.1%
2 | 1 | < 0.1%
3.333333333 | 1 | < 0.1%
5.333333333 | 1 | < 0.1%
5.666666667 | 1 | < 0.1%
6.142857143 | 1 | < 0.1%
7.5 | 1 | < 0.1%
9 | 1 | < 0.1%
9.5 | 1 | < 0.1%
11 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
6009.333333 | 1 | < 0.1%
4282 | 1 | < 0.1%
3906 | 1 | < 0.1%
3868.65 | 1 | < 0.1%
2880 | 1 | < 0.1%
2801 | 1 | < 0.1%
2733.944444 | 1 | < 0.1%
2518.769231 | 1 | < 0.1%
2160.333333 | 1 | < 0.1%
2082.225806 | 1 | < 0.1%

avg_unique_basket_size
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 268
Distinct (%): 9.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.118996421
Minimum: 0.1764705882
Maximum: 16
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.3 KiB

Quantile statistics

Minimum: 0.1764705882
5-th percentile: 0.95
Q1: 1.8
Median: 2.75
Q3: 4
95-th percentile: 6.5
Maximum: 16
Range: 15.82352941
Interquartile range (IQR): 2.2

Descriptive statistics

Standard deviation: 1.833650164
Coefficient of variation (CV): 0.5878974888
Kurtosis: 3.642014851
Mean: 3.118996421
Median Absolute Deviation (MAD): 1.083333333
Skewness: 1.430768983
Sum: 9257.181378
Variance: 3.362272923
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
3 | 204 | 6.9%
2 | 200 | 6.7%
4 | 163 | 5.5%
5 | 142 | 4.8%
3.5 | 136 | 4.6%
4.5 | 113 | 3.8%
2.5 | 111 | 3.7%
6 | 82 | 2.8%
3.333333333 | 71 | 2.4%
1 | 70 | 2.4%
Other values (258) | 1676 | 56.5%

Minimum 10 values

Value | Count | Frequency (%)
0.1764705882 | 1 | < 0.1%
0.2211055276 | 1 | < 0.1%
0.2727272727 | 1 | < 0.1%
0.2766990291 | 1 | < 0.1%
0.2790697674 | 1 | < 0.1%
0.2820512821 | 1 | < 0.1%
0.3306451613 | 1 | < 0.1%
0.3333333333 | 4 | 0.1%
0.3402061856 | 1 | < 0.1%
0.3626373626 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
16 | 1 | < 0.1%
14 | 3 | 0.1%
13.5 | 1 | < 0.1%
12 | 1 | < 0.1%
11 | 9 | 0.3%
10 | 7 | 0.2%
9.5 | 1 | < 0.1%
9 | 16 | 0.5%
8.5 | 2 | 0.1%
8 | 34 | 1.1%

Interactions

2022-05-30T18:49:43.353520image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-30T18:49:46.627297image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-30T18:49:49.786125image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-30T18:49:53.515300image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-30T18:49:56.967027image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-30T18:50:00.335200image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-30T18:50:03.404306image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-30T18:50:06.701640image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-30T18:49:27.470649image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-30T18:49:30.547774image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-30T18:49:34.149107image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-30T18:49:37.168250image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-30T18:49:40.466960image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-30T18:49:43.620352image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-30T18:49:46.867893image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-30T18:49:50.024681image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-30T18:49:53.748161image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-30T18:49:57.217875image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-30T18:50:00.602042image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-05-30T18:50:03.639670image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
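
The calculation described above can be sketched in plain Python. This is an illustrative sketch, not the code the report generator runs: the function names (`average_ranks`, `pearson_r`, `spearman_rho`) and the sample data are assumptions made for the example, and ties are assumed to receive the average of their rank positions.

```python
import math


def average_ranks(values):
    """Assign 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of sorted positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # The 1/n factors in the covariance and the standard deviations cancel.
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: y grows monotonically with x, but not linearly.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(f"spearman={rho:.4f} pearson={r:.4f}")
```

Because the ranks of `x` and `y` coincide in this example, ρ reaches its maximum while Pearson's r stays below it, which is the nonlinear monotonic case the text refers to.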

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line does not affect r; only the direction of the slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
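
A minimal Python sketch of this formula follows, together with a check of the invariance property. The function name `pearson_r` and the sample values are assumptions chosen for illustration; this is not the implementation used to produce the report.

```python
import math


def pearson_r(x, y):
    """Pearson's r: covariance of x and y over the product of their standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    # The 1/n factors in the covariance and the standard deviations cancel.
    return cov / math.sqrt(var_x * var_y)


# Hypothetical, roughly linear data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r_original = pearson_r(x, y)
# Shift and rescale each variable separately (positive scale factors).
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
# Negating one variable flips the sign of r but not its magnitude.
r_flipped = pearson_r(x, [-b for b in y])
print(r_original, r_rescaled, r_flipped)
```

The first two values agree up to floating-point error, and the third has the same magnitude with the opposite sign.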

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
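
The pair-counting definition above corresponds to the variant usually called τ-a, which can be sketched as follows. The function name `kendall_tau_a` and the sample data are assumptions for illustration. Statistical libraries often default to the τ-b variant, which adjusts the denominator for ties, so their results can differ from this sketch when the data contain tied values.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1  # both variables move in the same direction
        elif product < 0:
            discordant += 1  # the variables move in opposite directions
        # product == 0 means a tie in x or y; the pair counts as neither
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical data: one swapped pair out of six possible pairs.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print(tau)
```

In this example five pairs are concordant and one is discordant, so τ = (5 - 1) / 6.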

Phik (φk)

Phik (φk) is a correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reduces to the Pearson correlation coefficient when the input follows a bivariate normal distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.

Sample

First rows

|  | df_index | customer_id | gross_revenue | recency_days | qty_invoices | qty_itemns | qty_products | avg_ticket | avg_recency_days | frequency | qty_returns | avg_basket_size | avg_unique_basket_size |
| 0 | 0 | 17850 | 5391.21 | 372.0 | 34.0 | 1733.0 | 297.0 | 18.152222 | 35.500000 | 17.000000 | 40.0 | 50.970588 | 0.176471 |
| 1 | 1 | 13047 | 3232.59 | 56.0 | 9.0 | 1390.0 | 171.0 | 18.904035 | 27.250000 | 0.028302 | 35.0 | 154.444444 | 1.222222 |
| 2 | 2 | 12583 | 6705.38 | 2.0 | 15.0 | 5028.0 | 232.0 | 28.902500 | 23.187500 | 0.040323 | 50.0 | 335.200000 | 1.600000 |
| 3 | 3 | 13748 | 948.25 | 95.0 | 5.0 | 439.0 | 28.0 | 33.866071 | 92.666667 | 0.017921 | 0.0 | 87.800000 | 1.600000 |
| 4 | 4 | 15100 | 876.00 | 333.0 | 3.0 | 80.0 | 3.0 | 292.000000 | 8.600000 | 0.073171 | 22.0 | 26.666667 | 0.666667 |
| 5 | 5 | 15291 | 4623.30 | 25.0 | 14.0 | 2102.0 | 102.0 | 45.326471 | 23.200000 | 0.040115 | 29.0 | 150.142857 | 1.214286 |
| 6 | 6 | 14688 | 5630.87 | 7.0 | 21.0 | 3621.0 | 327.0 | 17.219786 | 18.300000 | 0.057221 | 399.0 | 172.428571 | 1.142857 |
| 7 | 7 | 17809 | 5411.91 | 16.0 | 12.0 | 2057.0 | 61.0 | 88.719836 | 35.700000 | 0.033520 | 41.0 | 171.416667 | 1.916667 |
| 8 | 8 | 15311 | 60767.90 | 0.0 | 91.0 | 38194.0 | 2379.0 | 25.543464 | 4.144444 | 0.243316 | 474.0 | 419.714286 | 0.472527 |
| 9 | 9 | 16098 | 2005.63 | 87.0 | 7.0 | 613.0 | 67.0 | 29.934776 | 47.666667 | 0.024390 | 0.0 | 87.571429 | 2.142857 |

Last rows

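|  | df_index | customer_id | gross_revenue | recency_days | qty_invoices | qty_itemns | qty_products | avg_ticket | avg_recency_days | frequency | qty_returns | avg_basket_size | avg_unique_basket_size |
| 2958 | 5626 | 17727 | 1060.25 | 15.0 | 1.0 | 645.0 | 66.0 | 16.064394 | 6.0 | 1.000000 | 6.0 | 645.000000 | 11.0 |
| 2959 | 5636 | 17232 | 421.52 | 2.0 | 2.0 | 203.0 | 36.0 | 11.708889 | 12.0 | 0.153846 | 0.0 | 101.500000 | 5.0 |
| 2960 | 5637 | 17468 | 137.00 | 10.0 | 2.0 | 116.0 | 5.0 | 27.400000 | 4.0 | 0.400000 | 0.0 | 58.000000 | 1.0 |
| 2961 | 5648 | 13596 | 697.04 | 5.0 | 2.0 | 406.0 | 166.0 | 4.199036 | 7.0 | 0.250000 | 0.0 | 203.000000 | 5.0 |
| 2962 | 5654 | 14893 | 1237.85 | 9.0 | 2.0 | 799.0 | 73.0 | 16.956849 | 2.0 | 0.666667 | 0.0 | 399.500000 | 7.0 |
| 2963 | 5658 | 12479 | 473.20 | 11.0 | 1.0 | 382.0 | 30.0 | 15.773333 | 4.0 | 1.000000 | 34.0 | 382.000000 | 8.0 |
| 2964 | 5679 | 14126 | 706.13 | 7.0 | 3.0 | 508.0 | 15.0 | 47.075333 | 3.0 | 0.750000 | 50.0 | 169.333333 | 2.0 |
| 2965 | 5685 | 13521 | 1092.39 | 1.0 | 3.0 | 733.0 | 435.0 | 2.511241 | 4.5 | 0.300000 | 0.0 | 244.333333 | 3.0 |
| 2966 | 5695 | 15060 | 301.84 | 8.0 | 4.0 | 262.0 | 120.0 | 2.515333 | 1.0 | 2.000000 | 0.0 | 65.500000 | 2.0 |
| 2967 | 5714 | 12558 | 269.96 | 7.0 | 1.0 | 196.0 | 11.0 | 24.541818 | 6.0 | 1.000000 | 196.0 | 196.000000 | 5.0 |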